Abstract


We analyze adolescent BMI and middle-age systolic blood pressure (SBP) repeatedly
measured on women enrolled in the Fels longitudinal study (FLS) between 1929
and 2010 to address three questions: Do adolescent-specific growth rates in BMI and
menarche affect middle-age SBP? Do they moderate the aging effect on middle-age
SBP? Have the effects changed over historical time? To address the questions, we
propose analyzing a growth curve model (GCM) that controls for age, birth-year cohort
and historical time. However, several complications in the data make the GCM
analysis non-standard. First, the person-specific adolescent BMI and middle-age SBP
trajectories are unobservable. Second, missing data are substantial on BMI, SBP and
menarche. Finally, modeling the latent trajectories for BMI and SBP, repeatedly measured
on two distinct sets of unbalanced time points, is computationally intensive.
We adopt a bivariate GCM for BMI and SBP with correlated random coefficients. To
efficiently handle missing values of BMI, SBP and menarche assumed missing at random,
we estimate their joint distribution by maximum likelihood via the EM algorithm
where the correlated random coefficients and menarche are multivariate normal. The
estimated distribution is then transformed to the desired GCM for SBP that includes
the random coefficients of BMI and menarche as covariates. We demonstrate unbiased
estimation by simulation. We find that adolescent growth rates in BMI and menarche
are positively associated with, and moderate the aging effect on, SBP in middle age,
controlling for age, cohort and historical time, but the effect sizes are at most modest.
The aging effect on SBP is significant, controlling for cohort and historical time, but
the cohort and historical time effects are not, controlling for age.
1 Introduction


The Fels longitudinal study (FLS) collected the lifetime repeated measurements on growth, 
health and body composition of 2,567 participants enrolled in yearly cohorts of 20 to 35 from 
1929 to 2010. The participants were scheduled to be examined every six months for the first 
18 years and every two years for the rest of their life spans. Each examination took extensive 
anthropometric measurements and recorded a health inventory (Sun et al. 2007, 2008). Via 
a growth curve model (GCM) with person-specific growth trajectories, also known as a hier- 
archical, multilevel, random coefficients or linear mixed model (Raudenbush and Bryk 2002; 
Goldstein 2003), we link growth in adolescent BMI and person-specific (age at) menarche 
to the course of middle-age systolic blood pressure (SBP) for 835 female participants where 
459 women were enrolled at birth producing 85% of the repeated measurements while others 
were family members, spouses and relatives. In this paper, we consider BMI and SBP as the 
biomarkers of obesity and health, respectively. 

Figure 1 plots all observed BMI and SBP of the participants against age where 5,694 
adolescent BMIs and 1,196 middle-age SBPs are nested within 552 and 436 individuals, re- 
spectively. It superimposes the scatterplots with person-specific longitudinal spaghetti plots. 
BMI and SBP are standardized to have mean 0 and variance 1. Every participant has at 
least one observation in adolescence or middle age. The spaghetti plots reveal that some 
growth patterns in BMI are higher overall, accelerate earlier or decelerate earlier than others. 
We analyze the impact of the growth patterns on progression of middle-age SBP. Because 
the 835 female participants consist of 98% European Americans and a tiny fraction of mi- 
nority participants including African American, Asian, multiracial and other individuals and 
because the minority effect is not significant on either BMI or SBP outcome in preliminary 
analysis, we do not consider the race covariate. 

The examination schedule implies that those enrolled at age 10 or earlier can have up
to 18 adolescent measurements between 10 and 19 years of age and up to 10 middle-age
measurements between 45 and 65 years of age, depending on how old they are as of 2010.
Some relatives, spouses and family members of the participants enrolled in the FLS during 
adolescence or later produced their measurements afterward. It is, however, unreasonable to 
take all these repeated measurements as complete data for analysis because only 18% of the 
835 participants have at least one measurement in both adolescence and middle age. Instead, 
we realistically take the measurements at the time points of actual visits to a clinic for exam- 
ination and menarche as complete data and assume data missing at random (MAR, Rubin 
1976). As the spaghetti plots in Figure 1 show, the actual timings of examinations varied, 
and BMI appears semi-regularly measured while SBP looks more sparse and unbalanced. 

Family relocations due to job transfers or newly acquired jobs, illness of participants
or of the family members who accompanied them to scheduled examinations, or other family
emergencies may have caused participants to miss scheduled examinations or drop out. During
the scheduled examinations, 12% of the adolescents and 18% of the adults did not have their
BMI and SBP measured, respectively, for various reasons. Menarche was either identified at a
scheduled examination or obtained from participants' recollections, and is missing otherwise.
It seems plausible that the missing patterns are not related to the missing values themselves.
As a reviewer pointed out, however, it is possible for an adolescent to have a high BMI 
worsened since the last examination and fail to participate in the next scheduled examination 
because of embarrassment or anxiety. As the reviewer also commented, participants who have 
adolescent BMI measurements but who have no SBP measurement as adults are likely to be 
unhealthy and have high SBP in middle age. In both cases the missing BMI and SBP values
would be associated with the probability of being missing, violating the MAR assumption.
However, because the missing rates on BMI and SBP are modest, because 40% of the female
participants having adolescent BMI measurements are younger than middle age as of 2010,
and because most adolescents appear to have attended most follow-ups in the spaghetti plots,
even if some missing BMI and SBP values are not missing at random (NMAR), they are
unlikely to seriously affect our analysis.

We leverage the FLS data collected longitudinally on individuals enrolled over time to
estimate the age, cohort and historical time effects separately. In large-scale single cohort
studies such as the Dunedin Longitudinal Study and the Environmental-Risk Longitudinal
Twin Study (Caspi et al. 2016; Moffitt et al. 2011), age and historical time are perfectly 
correlated. The confounded effects of age and childhood factors may explain only modest 
effect sizes of childhood risk factors on adult health outcomes reported (Felitti et al. 1998; 
Roberts et al. 2007; Moffitt et al. 2011; Caspi et al. 2016). In contrast, age and
cohort are perfectly collinear in a cross-sectional study of multiple birth year cohorts such as 
National Growth and Health Studies (Ren and Shin 2016). The multi-cohort FLS enables 
us to control for temporal, cohort and historical sources of SBP, and estimate the main 
effects of growth rates in adolescent BMI and menarche, and the growth rates-by-age and 
menarche-by-age interaction effects on middle-age SBP by a multilevel GCM. These effects 
may also change over cohorts and time. 

Researchers have studied the impact of longitudinal growth patterns in childhood obesity 
on adult health or the effects of childhood covariates on a longitudinal adult outcome in linear 
and nonlinear mixed models (Eriksson et al. 1999; Law et al. 2002; Ferreira et al. 2005; 
Nooyens et al. 2007; Sabo et al. 2012; McLeod et al. 2018; Sabo et al. 2014). Kim et 
al. (2016) analyzed FLS longitudinal childhood BMI to predict the person-specific timing 
of the BMI rebound, and subsequently analyzed the impact of the timing on a longitudinal 
adult cardiac outcome. Sabo et al. (2017) analyzed the FLS childhood BMI to predict child- 
specific ages and BMIs at the BMI rebound and maximum BMI growth, and subsequently 
estimated the effects of the predicted childhood covariates on longitudinal adulthood blood 
pressure outcomes. These studies have assessed the impact of either adult characteristics on 
a longitudinal childhood outcome or childhood covariates on a longitudinal adult outcome. 

In structural equation models (Bollen 1989), a longitudinal model for the repeated mea- 
surements of an outcome nested within individuals may be efficiently estimated by latent 
growth modeling where age variables are considered as fixed factor loadings and random co- 
efficients are latent variables (Willett and Sayer 1994; MacCallum et al. 1997; Bauer 2003; 
Bollen and Curran 2006; Preacher et al. 2008; Grimm and Ram 2009; Ram and Grimm 2015).
Growth-mixture modeling extends the latent growth model to identify unobserved subpopulations
exhibiting different growth trajectories (Wang and Bodner 2007). These approaches
model longitudinal outcomes measured on a common set of time points. 

In the joint modeling approach (Gueorguieva 2001; Ivanova et al. 2016), a mixture of 
discrete and continuous longitudinal outcomes may be modeled in a joint mixed model. Data 
MAR in the model may be imputed by univariate sequential regression models (Raghunathan 
et al. 2001), also known as multiple imputation by fully conditional specification (van Buuren 
et al. 2006; van Buuren 2011). A multilevel GCM given data MAR may also be efficiently 
estimated by maximum likelihood (ML) or Bayesian methods (Liu et al. 2000; Schafer and 
Yucel 2002; Goldstein and Browne 2002; Goldstein et al. 2009; Goldstein and Kounali 2009; 
Shin and Raudenbush 2007, 2010, 2020; Ren and Shin 2016). These approaches typically 
apply to longitudinal outcomes measured at a common set of time points. 

In this paper, we estimate a nonstandard multilevel GCM for middle-age SBP that
introduces challenges. First, the key covariates are unobservable adolescent-specific growth
rates in BMI that have to be estimated from sample data. Sample average growth rates, for
example, are unreliable measurements of the true growth rates and are known to introduce
bias in the estimated effects of the growth rates (Lüdtke et al. 2008; Shin and Raudenbush
2010, 2020; Grilli and Rampichini 2011). Furthermore, missing data are substantial
on BMI, SBP and menarche. Finally, the longitudinal $m_j$ BMIs in adolescence and $n_j$ SBPs in
middle age are measured at two separate sets of unbalanced time points nested within each
person, and subjects may have measurements in either adolescence or middle age, or both.
To handle missing data efficiently, we may express, as a special case of the joint modeling
approach, a bivariate multilevel GCM for BMI and SBP and a linear model for menarche
jointly where correlated person-specific random coefficients and menarche are multivariate
normal and the variance covariance structure is appropriately constrained. Computationally
efficient estimation of the joint model, however, involves derivation of new estimators and
a considerable amount of programming that fully leverages the longitudinal structure.
Our method is tailored to the structure with one set of outcomes repeatedly measured at
unbalanced time points during childhood and another set longitudinally measured at separate
unbalanced time points in adulthood within each individual, thereby achieving computational
efficiency.

Viewing BMI, SBP, menarche and random coefficients as complete data, we analyze all 
observed data to estimate the joint model efficiently by ML via the EM algorithm (Dempster 
et al. 1977; Dempster et al. 1981; Shin and Raudenbush 2007). At convergence, we compute 
standard errors by the approximate Fisher score (Hedeker and Gibbons 1994; Raudenbush 
et al. 2000; Olsen and Schafer 2001). Subsequently, by the delta method (Casella and Berger 
2002), we transform the estimated joint model to the desired GCM for SBP that includes 
the random coefficients of adolescent BMI and menarche as covariates. An alternative is
to draw multiple imputations of completed data, including the random coefficients, from the
estimated joint model and estimate the desired GCM given the multiple imputations (Shin and
Raudenbush 2007). This approach requires a cumbersome extra step of multiple imputation.
We chose the delta method, which demanded less programming than the alternative. We
demonstrate unbiased estimation by simulation. Our findings may have important policy
implications for promoting adult health based on juvenile obesity history.

Although FLS has not selected participants with respect to factors known to be associated 
with body composition, health, and other related conditions (Roche 1992), the participants 
are far from being randomly assigned to levels of key covariates, BMI growth rates and 
menarche. Furthermore, with adolescence far apart from middle age in time, there can 
be a number of confounders of the key covariates that we have not considered such as 
socioeconomic and environmental covariates unavailable in FLS. Consequently, the effects we
report in this paper are associational, not causal.

The Fels data set is not publicly available. However, the data set analyzed in this
paper will be available from the second author upon reasonable request. The next section
introduces our model. Section 3 explains how to estimate the model and compute standard
errors given data MAR. Section 4 evaluates the accuracy and precision of the estimation
by simulation. Section 5 presents analysis of the multi-cohort FLS sample data. The final
section discusses the limitations and future extensions of the approach.


2 Model 


Following Raudenbush and Bryk (2002), we express the repeated measurements of middle-age
SBP $R_{ij}$ and adolescent BMI $C_{tj}$ for adult occasion $i$ and adolescent occasion $t$ within
person $j$ in a level-1 model

$$R_{ij} = A_{Rij}^T \beta_{Rj} + B_{Rij}^T \gamma_{R1} + \epsilon_{Rij}, \quad \epsilon_{Rij} \sim N(0, \sigma_{RR}), \qquad (1)$$

$$C_{tj} = A_{Ctj}^T \beta_{Cj} + B_{Ctj}^T \gamma_{C1} + \epsilon_{Ctj}, \quad \epsilon_{Ctj} \sim N(0, \sigma_{CC}) \qquad (2)$$

where $A_{Rij}^T \beta_{Rj}$ is a polynomial in adult age $a_{ij}$ for vectors $A_{Rij}$ of age terms and $\beta_{Rj}$ of
person-specific age effects (e.g. $A_{Rij}^T = [1\ a_{ij}]$ and $\beta_{Rj}^T = [\beta_{R0j}\ \beta_{R1j}]$), $B_{Rij}$ is a vector of
known covariates (e.g. historical time) having fixed effects $\gamma_{R1}$, $A_{Ctj}^T \beta_{Cj}$ is a polynomial in
adolescent age $a_{tj}$ for vectors $A_{Ctj}$ of age terms and $\beta_{Cj}$ of adolescent-specific age effects (e.g.
$A_{Ctj}^T = [1\ a_{tj}\ a_{tj}^2]$ and $\beta_{Cj}^T = [\beta_{C0j}\ \beta_{C1j}\ \beta_{C2j}]$), $B_{Ctj}$ is a vector of known covariates having
fixed effects $\gamma_{C1}$ (e.g. $a_{tj}^3$ and historical time), and occasion-specific random errors $\epsilon_{Rij}$ and
$\epsilon_{Ctj}$ are independent for $i = 1, \ldots, n_j$, $t = 1, \ldots, m_j$ and $j = 1, \ldots, J$. With adolescence and
middle age far apart within each person, it appears reasonable that occasion-specific random
errors $\epsilon_{Rij}$ and $\epsilon_{Ctj}$ are uncorrelated given the person-specific growth trajectories $\beta_{Rj}$ and
$\beta_{Cj}$ and time-varying covariates. That is, given the covariates, SBP and BMI are dependent
on each other only through the dependence between the trajectories.


The aging effects $\beta_{Rj}$ on SBP vary between individuals according to a level-2 model

$$\beta_{Rj} = \Pi_R U_j + v_j, \quad v_j \sim N(0, \tau) \qquad (3)$$

for a matrix $\Pi_R$ of fixed effects; a vector $U_j = [W_{2j}^T\ Y_{2j}^T\ \beta_{Cj}^T]^T$ of known covariates $W_{2j}$
(e.g. cohort), partially observed covariates $Y_{2j}$ (e.g. menarche) and unobservable growth
rates $\beta_{Cj}$ in adolescent BMI; and a vector of random effects $v_j$ independent of random errors
and $U_j$. Conditional on $v_j$ and covariates, SBP and BMI are independent. Although the
growth trajectories may also vary differently across subpopulations to produce non-constant
variances of random coefficients, for example, between males and females, we believe that the
constant variance covariance matrix $\tau$ is plausible within our analysis of females, controlling
for $U_j$. Equations (1) and (3) imply our desired GCM (Raudenbush and Bryk 2002)

$$R_{ij} = A_{Rij}^T \Pi_R U_j + B_{Rij}^T \gamma_{R1} + A_{Rij}^T v_j + \epsilon_{Rij} \qquad (4)$$

for a matrix of fixed effects $\Pi_R$ including the effects of aging moderated by $U_j$, the fixed
effects $\gamma_{R1}$ of $B_{Rij}$, and the random effects $v_j$ of $A_{Rij}$ independent of $\epsilon_{Rij}$.
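As a minimal illustrative sketch (not part of the original analysis), the following Python snippet checks numerically, for one person and one adult occasion, that substituting the level-2 model (3) into the level-1 model (1) gives the combined form (4). All numeric values, the helper `dot`, and the chosen dimensions (a linear age polynomial and a three-element $U_j$) are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(u, v))

# Hypothetical values for one person j at one adult occasion i.
A_Rij = [1.0, 3.0]             # age terms [1, a_ij], age centered at 55
B_Rij = [1.0, 0.0]             # known covariates, e.g. visit-year dummies
gamma_R1 = [0.2, -0.1]         # fixed effects of B_Rij
U_j = [1.0, -0.5, 0.3]         # [intercept, menarche, BMI growth rate]
Pi_R = [[0.10, 0.05, 0.20],    # row for the SBP intercept beta_R0j
        [0.02, 0.01, 0.03]]    # row for the SBP age slope beta_R1j
v_j = [0.4, -0.02]             # person-specific random effects
eps_Rij = 0.15                 # occasion-specific random error

# Level-2 model (3): beta_Rj = Pi_R U_j + v_j
beta_Rj = [dot(row, U_j) + v for row, v in zip(Pi_R, v_j)]

# Level-1 model (1): R_ij = A' beta_Rj + B' gamma_R1 + eps
R_level1 = dot(A_Rij, beta_Rj) + dot(B_Rij, gamma_R1) + eps_Rij

# Combined GCM (4): R_ij = A' Pi_R U_j + B' gamma_R1 + A' v_j + eps
fixed_part = dot(A_Rij, [dot(row, U_j) for row in Pi_R])
R_gcm = fixed_part + dot(B_Rij, gamma_R1) + dot(A_Rij, v_j) + eps_Rij
```

In this layout the second row of `Pi_R` multiplies age, so its entries play the role of the covariate-by-age interaction effects, while the first row holds the main effects at the centering age.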

Estimation of GCM (4) is difficult because $(R_{ij}, C_{tj}, Y_{2j})$ are partially observed and
because latent trajectories $\beta_{Cj}$ are unobservable. We introduce efficient and unbiased
estimation below. For a positive integer $n$, let $I_n$ be an $n$-by-$n$ identity matrix, $1_n$ be a vector of $n$
unities, a diagonal matrix $\oplus_{i=1}^{n} A_i = \mathrm{diag}\{A_1, \ldots, A_n\}$, probability density function (pdf)
$f(A)$ and conditional pdf $f(A|B)$.


3 Efficient Estimation 


For efficient estimation by all observed data, we estimate the joint distribution of $(R_{ij}, C_{tj}, Y_{2j})$
MAR given known covariates. We view $q$ random coefficients $\beta_{1j} = [\beta_{Rj}^T\ \beta_{Cj}^T]^T$ and $p_2$
variables $Y_{2j}$ as level-2 complete data in a model $f(Y_{2j}^*)$

$$Y_{2j}^* = X_{2j}\gamma_2 + b_j, \quad b_j \sim N(0, T(\phi_T)) \qquad (5)$$

for $Y_{2j}^* = [\beta_{1j}^T\ Y_{2j}^T]^T$, a matrix of covariates $X_{2j} = \mathrm{diag}\{X_{21j}, X_{22j}\}$ having fixed effects
$\gamma_2 = [\gamma_{21}^T\ \gamma_{22}^T]^T$, random effects $b_j = [b_{1j}^T\ b_{2j}^T]^T$ and

$$T(\phi_T) = \begin{bmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{bmatrix}$$

having distinct elements $\phi_T$ for $X_{21j} = I_q \otimes W_{2j}^T$ and $X_{22j} = I_{p_2} \otimes W_{2j}^T$. Each outcome in $Y_{2j}^*$ may control
for a different subset of $W_{2j}$ as we do in Section 5. Let $\theta_2 = [\gamma_2^T\ \phi_T^T]^T$.

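We aggregate $R_j = [R_{1j} \cdots R_{n_j j}]^T$ and $C_j = [C_{1j} \cdots C_{m_j j}]^T$ to view $Y_{1j} = [R_j^T\ C_j^T]^T$ and
$Y_{2j}^*$ as complete data and observed values of $Y_j = [Y_{1j}^T\ Y_{2j}^T]^T$ as observed data for person $j$,
and estimate the joint model $f(Y_{1j}, Y_{2j}^*) = f(Y_{1j}|\beta_{1j}) f(Y_{2j}^*)$ given known covariates by the
EM algorithm. We aggregate Equations (1) and (2) as the level-1 model $f(Y_{1j}|\beta_{1j})$

$$Y_{1j} = A_{1j}\beta_{1j} + B_{1j}\gamma_{11} + \epsilon_{1j}, \quad \epsilon_{1j} \sim N(0, \Psi_j) \qquad (6)$$

of person $j$ where $A_{1j} = \mathrm{diag}\{A_{Rj}, A_{Cj}\}$, $\beta_{1j} = [\beta_{Rj}^T\ \beta_{Cj}^T]^T$, $B_{1j} = \mathrm{diag}\{B_{Rj}, B_{Cj}\}$,
$\gamma_{11} = [\gamma_{R1}^T\ \gamma_{C1}^T]^T$, $\epsilon_{1j} = [\epsilon_{Rj}^T\ \epsilon_{Cj}^T]^T$ and $\Psi_j = \mathrm{diag}\{I_{n_j}\sigma_{RR}, I_{m_j}\sigma_{CC}\}$ for
$A_{Rj} = [A_{R1j} \cdots A_{Rn_j j}]^T$, $A_{Cj} = [A_{C1j} \cdots A_{Cm_j j}]^T$, $B_{Rj} = [B_{R1j} \cdots B_{Rn_j j}]^T$,
$B_{Cj} = [B_{C1j} \cdots B_{Cm_j j}]^T$, $\epsilon_{Rj} = [\epsilon_{R1j} \cdots \epsilon_{Rn_j j}]^T$ and $\epsilon_{Cj} = [\epsilon_{C1j} \cdots \epsilon_{Cm_j j}]^T$.
Let $\theta_R = (\gamma_{R1}^T, \sigma_{RR})^T$ and $\theta_C = (\gamma_{C1}^T, \sigma_{CC})^T$.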


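Equations (5) and (6) imply a linear mixed model

$$Y_j = X_j\gamma + Z_j b_j + e_j \qquad (7)$$

for

$$Y_j = \begin{bmatrix} Y_{1j} \\ Y_{2j} \end{bmatrix}, \quad X_j = \begin{bmatrix} X_{1j} & 0 \\ 0 & X_{22j} \end{bmatrix}, \quad \gamma = \begin{bmatrix} \gamma_1 \\ \gamma_{22} \end{bmatrix}, \quad Z_j = \begin{bmatrix} A_{1j} & 0 \\ 0 & I_{p_2} \end{bmatrix} \quad \text{and} \quad e_j = \begin{bmatrix} \epsilon_{1j} \\ 0 \end{bmatrix}$$

where $X_{1j} = [B_{1j}\ \ A_{1j}X_{21j}]$ and $\gamma_1 = [\gamma_{11}^T\ \gamma_{21}^T]^T$. All parameters are $\theta = (\theta_R, \theta_C, \theta_2)$.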
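The stacking in Equation (7) can be checked numerically. The sketch below is illustrative only and assumes NumPy is available; it builds the block matrices for one hypothetical person with two SBP and three BMI occasions, linear age polynomials (so $q = 4$), one level-2 outcome ($p_2 = 1$) and $W_{2j} = [1\ W_j]^T$, then confirms that Equations (5) and (6) reproduce the stacked form. The helper `block_diag` and all numeric values are assumptions made for the example.

```python
import numpy as np

def block_diag(*blocks):
    """Place 2-D arrays along the diagonal of a zero matrix."""
    out = np.zeros((sum(b.shape[0] for b in blocks),
                    sum(b.shape[1] for b in blocks)))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r, c = r + b.shape[0], c + b.shape[1]
    return out

rng = np.random.default_rng(0)
A_Rj = np.column_stack([np.ones(2), [-3.0, 3.0]])        # adult age terms
A_Cj = np.column_stack([np.ones(3), [-1.5, 0.5, 2.5]])   # adolescent age terms
A_1j = block_diag(A_Rj, A_Cj)                            # 5 x 4
B_1j = block_diag(np.array([[1.0], [0.0]]),              # one covariate per outcome
                  np.array([[0.0], [1.0], [1.0]]))       # 5 x 2
W_2j_T = np.array([[1.0, 1.0]])                          # W_2j' = [1, W_j]
X_21j = np.kron(np.eye(4), W_2j_T)                       # I_q (x) W_2j', 4 x 8
X_22j = np.kron(np.eye(1), W_2j_T)                       # I_p2 (x) W_2j', 1 x 2

gamma_11, gamma_21, gamma_22 = rng.normal(size=2), rng.normal(size=8), rng.normal(size=2)
b_1j, b_2j, eps_1j = rng.normal(size=4), rng.normal(size=1), rng.normal(size=5)

# Equations (5) and (6)
beta_1j = X_21j @ gamma_21 + b_1j
Y_1j = A_1j @ beta_1j + B_1j @ gamma_11 + eps_1j
Y_2j = X_22j @ gamma_22 + b_2j

# Equation (7)
X_1j = np.hstack([B_1j, A_1j @ X_21j])
X_j = block_diag(X_1j, X_22j)
gamma = np.concatenate([gamma_11, gamma_21, gamma_22])
Z_j = block_diag(A_1j, np.eye(1))
b_j = np.concatenate([b_1j, b_2j])
e_j = np.concatenate([eps_1j, np.zeros(1)])
Y_j = X_j @ gamma + Z_j @ b_j + e_j
```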


3.1 The EM Algorithm 


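We view complete data as $(Y_{1j}, \beta_{1j}, Y_{2j})$ for individual $j$, equivalent to $(\epsilon_{1j}, b_{1j}, b_{2j}) = (\epsilon_{1j}, b_j)$
given known covariates and $\theta$ for $\epsilon_{1j} = Y_{1j} - A_{1j}\beta_{1j} - B_{1j}\gamma_{11}$ and $b_{2j} = Y_{2j} - X_{22j}\gamma_{22}$. This
approach simplifies expressions for the complete data likelihood, E-step, M-step and standard
error estimation. The complete data likelihood is

$$L(\theta|\epsilon_{1j}, b_j) = \prod_{j=1}^{J} \left( \prod_{i=1}^{n_j} f(\epsilon_{Rij}|b_{1j}) \prod_{t=1}^{m_j} f(\epsilon_{Ctj}|b_{1j}) \right) f(b_j).$$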


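For the E step, we define matrix $O_{2j}$ of ones and zeros to select observed data $Y_{o2j} = O_{2j}Y_{2j}$
from $Y_{2j}$, where $O_{2j}b_{2j} \sim N(0, T_{o22j})$ for $T_{o22j} = O_{2j}T_{22}O_{2j}^T$ (Shin and Raudenbush 2007). Likewise,
we define $O_{Rj}$, $O_{Cj}$ and $O_{1j} = \mathrm{diag}\{O_{Rj}, O_{Cj}\}$ to find observed data $R_{oj} = O_{Rj}R_j$ of length
$n_{oj}$, $C_{oj} = O_{Cj}C_j$ of length $m_{oj}$ and $Y_{o1j} = O_{1j}Y_{1j} = [R_{oj}^T\ C_{oj}^T]^T$. Equation (7) implies $f(Y_{oj})$

$$Y_{oj} = X_{oj}\gamma + Z_{oj}b_j + e_{oj} \sim N(X_{oj}\gamma, V_{oj}) \qquad (8)$$

where $Y_{oj} = [Y_{o1j}^T\ Y_{o2j}^T]^T$, $X_{oj} = \mathrm{diag}\{X_{o1j}, X_{o22j}\}$, $Z_{oj} = \mathrm{diag}\{A_{o1j}, O_{2j}\}$, $e_{oj} = [\epsilon_{o1j}^T\ 0^T]^T$
and $V_{oj} = Z_{oj}TZ_{oj}^T + \mathrm{diag}\{\Psi_{oj}, 0\}$ for $X_{o1j} = O_{1j}X_{1j}$, $X_{o22j} = O_{2j}X_{22j}$, $A_{o1j} = O_{1j}A_{1j}$,
$\epsilon_{o1j} = O_{1j}\epsilon_{1j}$ and $\Psi_{oj} = \mathrm{diag}\{I_{n_{oj}}\sigma_{RR}, I_{m_{oj}}\sigma_{CC}\}$. For simplicity of notation, let $\mathcal{E}(A) = E(A|Y_{oj})$
and $\mathcal{V}(A) = \mathrm{var}(A|Y_{oj})$. The observed-data joint model $f(Y_{oj})$ implies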


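$$b_j|Y_{oj} \sim N[\mathcal{E}(b_j), \mathcal{V}(b_j)], \quad \epsilon_{1j}|Y_{oj} \sim N[\mathcal{E}(\epsilon_{1j}), \mathcal{V}(\epsilon_{1j})] \qquad (9)$$

for the E components $\mathcal{V}(b_{1j}) = \Delta_j^{-1}$, $\mathrm{cov}(b_{1j}, b_{2j}|Y_{oj}) = \Delta_j^{-1}\Omega_{11j}^{-1}\Omega_{12j}$,

$$\mathcal{V}(b_{2j}) = \Omega_{22j} - \Omega_{21j}\left(\Omega_{11j}^{-1} - \Omega_{11j}^{-1}\Delta_j^{-1}\Omega_{11j}^{-1}\right)\Omega_{12j},$$

$$\mathcal{E}(b_{1j}) = \Delta_j^{-1}\left[A_{o1j}^T\Psi_{oj}^{-1}d_{o1j} + \Omega_{11j}^{-1}E(b_{1j}|Y_{o2j})\right],$$

$$\mathcal{E}(b_{2j}) = \Omega_{21j}\Omega_{11j}^{-1}\left[\mathcal{E}(b_{1j}) - E(b_{1j}|Y_{o2j})\right] + E(b_{2j}|Y_{o2j}),$$

$$\mathcal{V}(\epsilon_{1j}) = \Psi_j - \Psi_jO_{1j}^T\left(\Psi_{oj}^{-1} - \Psi_{oj}^{-1}A_{o1j}\Delta_j^{-1}A_{o1j}^T\Psi_{oj}^{-1}\right)O_{1j}\Psi_j,$$

$$\mathcal{E}(\epsilon_{1j}) = \Psi_jO_{1j}^T\Psi_{oj}^{-1}\left[d_{o1j} - A_{o1j}\mathcal{E}(b_{1j})\right]$$

where $\Omega_{klj} = \mathrm{cov}(b_{kj}, b_{lj}|Y_{o2j}) = T_{kl} - T_{k2}O_{2j}^TT_{o22j}^{-1}O_{2j}T_{2l}$, $\Delta_j = A_{o1j}^T\Psi_{oj}^{-1}A_{o1j} + \Omega_{11j}^{-1}$,
$d_{o1j} = Y_{o1j} - X_{o1j}\gamma_1$ and $E(b_{kj}|Y_{o2j}) = T_{k2}O_{2j}^TT_{o22j}^{-1}d_{o2j}$ for $d_{o2j} = Y_{o2j} - X_{o22j}\gamma_{22}$ and $k, l = 1, 2$.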
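One way to sanity-check the E components is to compare them with direct Gaussian conditioning of $b_j$ on the whole observed vector. The sketch below does this for a single hypothetical person with random intercepts only ($q = 2$), two level-2 outcomes ($p_2 = 2$), one missing BMI value and one missing level-2 value. It assumes NumPy; the matrices, residuals and helper names are invented for illustration and this is not the C program used for the analysis.

```python
import numpy as np

inv = np.linalg.inv
rng = np.random.default_rng(1)
q, p2 = 2, 2                                   # sizes of b_1j and b_2j

# Hypothetical positive definite T = var(b_j), partitioned into blocks T_kl.
L = rng.normal(size=(q + p2, q + p2))
T = L @ L.T + np.eye(q + p2)
Tb = {(1, 1): T[:q, :q], (1, 2): T[:q, q:], (2, 1): T[q:, :q], (2, 2): T[q:, q:]}

# Two SBP and two BMI occasions with random intercepts only.
A_1j = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
Psi_j = np.diag([0.5, 0.5, 0.8, 0.8])           # sigma_RR = 0.5, sigma_CC = 0.8
O_1j = np.eye(4)[[0, 1, 3]]                     # first BMI value is missing
O_2j = np.eye(p2)[[0]]                          # second element of Y_2j is missing
d_o1j = O_1j @ np.array([0.3, -0.2, 0.0, 1.1])  # observed Y_1j - X_1j gamma_1
d_o2j = O_2j @ np.array([0.7, 0.0])             # observed Y_2j - X_22j gamma_22

A_o1j, Psi_oj = O_1j @ A_1j, O_1j @ Psi_j @ O_1j.T
T_o22j = O_2j @ Tb[2, 2] @ O_2j.T

def omega(k, l):
    """Omega_klj = cov(b_kj, b_lj | Y_o2j)."""
    return Tb[k, l] - Tb[k, 2] @ O_2j.T @ inv(T_o22j) @ O_2j @ Tb[2, l]

def mean_given_Y2(k):
    """E(b_kj | Y_o2j)."""
    return Tb[k, 2] @ O_2j.T @ inv(T_o22j) @ d_o2j

# E components following the expressions in the text.
Om11_inv = inv(omega(1, 1))
Delta_j = A_o1j.T @ inv(Psi_oj) @ A_o1j + Om11_inv
V_b1 = inv(Delta_j)
C_b12 = V_b1 @ Om11_inv @ omega(1, 2)
V_b2 = omega(2, 2) - omega(2, 1) @ (Om11_inv - Om11_inv @ V_b1 @ Om11_inv) @ omega(1, 2)
E_b1 = V_b1 @ (A_o1j.T @ inv(Psi_oj) @ d_o1j + Om11_inv @ mean_given_Y2(1))
E_b2 = omega(2, 1) @ Om11_inv @ (E_b1 - mean_given_Y2(1)) + mean_given_Y2(2)

# Direct Gaussian conditioning of b_j on the observed vector, for comparison.
Z_oj = np.block([[A_o1j, np.zeros((3, p2))], [np.zeros((1, q)), O_2j]])
noise = np.zeros((4, 4))
noise[:3, :3] = Psi_oj
V_oj = Z_oj @ T @ Z_oj.T + noise
d_oj = np.concatenate([d_o1j, d_o2j])
E_direct = T @ Z_oj.T @ inv(V_oj) @ d_oj
V_direct = T - T @ Z_oj.T @ inv(V_oj) @ Z_oj @ T
```

The two-stage form only requires inverting matrices of the size of $b_{1j}$ and of the observed level-2 vector, whereas the direct form inverts a matrix whose size grows with the number of occasions.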




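Given $\theta$ from the previous iteration, we compute the expected complete-data ML $\hat{\theta}$

$$\mathcal{E}(\hat{\gamma}_{R1}) = \gamma_{R1} + \Big(\sum_j B_{Rj}^TB_{Rj}\Big)^{-1}\sum_j B_{Rj}^T\mathcal{E}(\epsilon_{Rj}), \quad \mathcal{E}(\hat{\sigma}_{RR}) = \frac{\sum_j \mathcal{E}(\epsilon_{Rj}^T\epsilon_{Rj})}{N_R},$$

$$\mathcal{E}(\hat{\gamma}_{C1}) = \gamma_{C1} + \Big(\sum_j B_{Cj}^TB_{Cj}\Big)^{-1}\sum_j B_{Cj}^T\mathcal{E}(\epsilon_{Cj}), \quad \mathcal{E}(\hat{\sigma}_{CC}) = \frac{\sum_j \mathcal{E}(\epsilon_{Cj}^T\epsilon_{Cj})}{N_C},$$

$$\mathcal{E}(\hat{\gamma}_2) = \gamma_2 + \Big(\sum_j X_{2j}^TT^{-1}X_{2j}\Big)^{-1}\sum_j X_{2j}^TT^{-1}\mathcal{E}(b_j), \quad \mathcal{E}(\hat{T}) = \frac{\sum_j \mathcal{E}(b_jb_j^T)}{J}$$

where $\sum_j = \sum_{j=1}^{J}$, $N_R = \sum_j n_j$ and $N_C = \sum_j m_j$.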
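Let $Y_o = (Y_{o1}, Y_{o2}, \ldots, Y_{oJ})$ be of length $N_o$. Then, the log-likelihood $l = l(\theta|Y_o) =
\sum_j \log[f(Y_{o1j}|Y_{o2j})f(Y_{o2j})]$ is

$$l = -\frac{1}{2}\sum_j \Big\{ \log|\Omega_{11j}| + \log|\Delta_j| + n_{oj}\log\sigma_{RR} + m_{oj}\log\sigma_{CC} + \log|T_{o22j}| + d_{o1j}^T\Psi_{oj}^{-1}\left[d_{o1j} - A_{o1j}\mathcal{E}(b_{1j})\right]$$
$$\qquad + E(b_{1j}|Y_{o2j})^T\Omega_{11j}^{-1}\left[E(b_{1j}|Y_{o2j}) - \mathcal{E}(b_{1j})\right] + d_{o2j}^TT_{o22j}^{-1}d_{o2j} \Big\} - \frac{N_o}{2}\log(2\pi).$$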
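The closed-form contribution of one person to the log-likelihood can be compared with the multivariate normal log-density of the observed vector evaluated directly. The following sketch assumes NumPy and a hypothetical person with two observed SBPs, one observed BMI, random intercepts only ($q = 2$) and one observed level-2 outcome ($p_2 = 1$); all numbers are invented for illustration.

```python
import numpy as np

inv = np.linalg.inv

def logdet(M):
    """Log determinant of a positive definite matrix."""
    return np.linalg.slogdet(M)[1]

rng = np.random.default_rng(7)
L = rng.normal(size=(3, 3))
T = L @ L.T + np.eye(3)                  # var(b_j) with q = 2 and p_2 = 1
T11, T12, T22 = T[:2, :2], T[:2, 2:], T[2:, 2:]
s_RR, s_CC = 0.5, 0.8                    # hypothetical error variances
A_o1j = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # 2 SBPs, 1 BMI observed
Psi_oj = np.diag([s_RR, s_RR, s_CC])
d_o1j = np.array([0.3, -0.2, 1.1])       # Y_o1j - X_o1j gamma_1
d_o2j = np.array([0.7])                  # Y_o2j - X_o22j gamma_22
T_o22j = T22                             # the level-2 outcome is observed
n_oj, m_oj, N_oj = 2, 1, 4

# Quantities from the E step.
Om11 = T11 - T12 @ inv(T_o22j) @ T12.T
mu1 = T12 @ inv(T_o22j) @ d_o2j          # E(b_1j | Y_o2j)
Delta_j = A_o1j.T @ inv(Psi_oj) @ A_o1j + inv(Om11)
E_b1 = inv(Delta_j) @ (A_o1j.T @ inv(Psi_oj) @ d_o1j + inv(Om11) @ mu1)

# Closed-form log-likelihood contribution of person j.
l_closed = -0.5 * (
    logdet(Om11) + logdet(Delta_j)
    + n_oj * np.log(s_RR) + m_oj * np.log(s_CC) + logdet(T_o22j)
    + d_o1j @ inv(Psi_oj) @ (d_o1j - A_o1j @ E_b1)
    + mu1 @ inv(Om11) @ (mu1 - E_b1)
    + d_o2j @ inv(T_o22j) @ d_o2j
) - 0.5 * N_oj * np.log(2 * np.pi)

# Direct evaluation from Y_oj ~ N(X_oj gamma, V_oj).
Z_oj = np.block([[A_o1j, np.zeros((3, 1))], [np.zeros((1, 2)), np.eye(1)]])
V_oj = Z_oj @ T @ Z_oj.T
V_oj[:3, :3] += Psi_oj
d_oj = np.concatenate([d_o1j, d_o2j])
l_direct = -0.5 * (logdet(V_oj) + d_oj @ inv(V_oj) @ d_oj + N_oj * np.log(2 * np.pi))
```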


3.2 Standard Errors 


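At ML $\theta = \hat{\theta}$, we express the log-likelihood $l = \sum_j l_j$ and score $S = \sum_j S_j$ for

$$l_j = \log f(Y_{oj}) = \log \int g_j \, db_j \, dY_{mj}, \quad S_j = \frac{\partial l_j}{\partial\theta} = \mathcal{E}\left(\frac{\partial \log g_j}{\partial\theta}\right)$$

for the complete-data density $g_j$ of $(Y_j, b_j)$ where $Y_j = (Y_{oj}, Y_{mj})$ for $k_j$ missing values $Y_{mj}$, and estimate
$\mathrm{var}(\hat{\theta}) = (\sum_j S_jS_j^T)^{-1}$ by the approximate Fisher score (Bock and Lieberman 1970; Raudenbush et al. 2000;
Olsen and Schafer 2001). Let $E = \partial\,\mathrm{vec}(T)/\partial\phi_T^T$ and $S_j^T = [S_{Rj}^T\ S_{Cj}^T\ S_{2j}^T]$ for $S_{Rj} = \partial l_j/\partial\theta_R$,
$S_{Cj} = \partial l_j/\partial\theta_C$ and $S_{2j} = \partial l_j/\partial\theta_2$. Based on the posterior distributions (9), we compute $S_j$ from
the conditional expectations of the complete-data scores

$$\mathcal{E}(S_{Rj}) = \begin{bmatrix} \sum_i B_{Rij}\mathcal{E}(\epsilon_{Rij})/\sigma_{RR} \\ G_{Rj}/2 \end{bmatrix}, \quad \mathcal{E}(S_{Cj}) = \begin{bmatrix} \sum_t B_{Ctj}\mathcal{E}(\epsilon_{Ctj})/\sigma_{CC} \\ G_{Cj}/2 \end{bmatrix}, \quad \mathcal{E}(S_{2j}) = \begin{bmatrix} X_{2j}^TT^{-1}\mathcal{E}(b_j) \\ E^T\mathrm{vec}(G_{2j})/2 \end{bmatrix}$$

for $G_{Rj} = \sum_i \mathcal{E}(\epsilon_{Rij}^2)/\sigma_{RR}^2 - n_j/\sigma_{RR}$, $G_{Cj} = \sum_t \mathcal{E}(\epsilon_{Ctj}^2)/\sigma_{CC}^2 - m_j/\sigma_{CC}$ and
$G_{2j} = T^{-1}\mathcal{E}(b_jb_j^T)T^{-1} - T^{-1}$. Next, we transform the estimated joint model to the desired GCM by the delta method.
Specifically, each parameter $\delta$ of the GCM is a function $g(\theta)$ of $\theta$. We take the first-order
Taylor-series approximation of $g(\theta)$ to estimate $\delta$ and the associated standard error.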
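To illustrate the delta-method step, the sketch below takes a hypothetical two-element parameter $\theta = (T_{12}, T_{22})$, a covariance between an SBP random coefficient and a BMI growth rate and the variance of that growth rate, and transforms it to the regression slope $g(\theta) = T_{12}/T_{22}$. The estimates, their covariance matrix and the function names are invented for the example, and the gradient is obtained by central differences rather than analytically.

```python
import math

def numerical_gradient(g, theta, h=1e-6):
    """Central-difference gradient of a scalar function g at theta."""
    grad = []
    for k in range(len(theta)):
        up, down = list(theta), list(theta)
        up[k] += h
        down[k] -= h
        grad.append((g(up) - g(down)) / (2 * h))
    return grad

def delta_method(g, theta_hat, cov_theta):
    """Return g(theta_hat) and its first-order (delta method) standard error."""
    grad = numerical_gradient(g, theta_hat)
    size = len(grad)
    var = sum(grad[a] * cov_theta[a][b] * grad[b]
              for a in range(size) for b in range(size))
    return g(theta_hat), math.sqrt(var)

def slope(theta):
    """Regression slope implied by a covariance T_12 and a variance T_22."""
    t12, t22 = theta
    return t12 / t22

theta_hat = [0.2, 1.0]                          # hypothetical ML estimates
cov_theta = [[0.004, 0.001], [0.001, 0.009]]    # hypothetical var(theta_hat)
estimate, se = delta_method(slope, theta_hat, cov_theta)
```

With these inputs the gradient is approximately $(1, -0.2)$, so the variance is $0.004 - 2(0.2)(0.001) + 0.04(0.009) = 0.00396$.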

To find the initial values of $\theta$, we estimate two GCMs $f(R_{ij}, Y_{2j})$ and $f(C_{tj}, Y_{2j})$ given
known covariates, implied from the joint model in Equations (1), (2) and (5), by efficient ML
estimation (Shin and Raudenbush 2007); use estimated $f(R_{ij})$, $\mathrm{cov}(R_{ij}, Y_{2j})$ and $f(C_{tj}, Y_{2j})$;
and set $\mathrm{cov}(R_{ij}, C_{tj})$ to zero. We estimated all models via a C program written by the first
author¹, and simulated data and implemented the delta method within the R environment
(R Core Team 2017) on a Dell XPS PC with Intel's i7 processor. In our experience with
this estimation method so far, added random coefficients increase computational burden
noticeably, but added time points do not.


4 Simulation Study 


To assess our estimation, we simulate in sequence $W_j \sim \mathrm{Bernoulli}(0.5)$ simulating a dummy
indicator of a group of birth-year cohorts, $Y_{2j}|W_j \sim N(1 + W_j, T_{22})$ simulating the standardized
menarche and a vector of four correlated random coefficients $\beta_{1j}|W_j, Y_{2j} \sim N(1_4 +
1_4W_j + 1_4Y_{2j}, T_{1|2})$ describing the linear growth trajectories of adolescent BMI and middle-age
SBP at level 2 for $T_{22} = 1$, $\beta_{1j} = [\beta_{R0j}\ \beta_{R1j}\ \beta_{C0j}\ \beta_{C1j}]^T$, $T_{1|2} = \mathrm{var}(\beta_{1j}|Y_{2j})$ having variances
equal to 1 and covariances equal to 0.2 so that $W_{2j} = [1\ W_j]^T$ in Equation (5). Given $\beta_{1j}$,
we simulate SBP $R_{ij} \sim N(\beta_{R0j} + \beta_{R1j}a_{ij} + a_{ij}^2, 1)$ at age $a_{ij} = -9, -3, 3, 9$ centered at the
midyear 55 of middle age in Equation (1) and BMI $C_{tj} \sim N(\beta_{C0j} + \beta_{C1j}a_{tj} + a_{tj}^2, 1)$ at age
$a_{tj} = -4.5, -3.5, \ldots, 3.5, 4.5$ centered at the midyear 14.5 of adolescence in Equation (2).
Sample sizes are $n_j = 4$, $m_j = 10$ and $J = 500$ to reflect comparatively sparse middle-age
SBP of FLS sample data analyzed in the next section. We simulate 1,000 data sets.
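A single complete data set from this design could be generated along the following lines. This is a hedged sketch in Python rather than the R code actually used: it assumes that the fixed age term in the level-1 means is the squared centered age, and it induces the common covariance of 0.2 among the four random coefficients through a shared standard normal factor. The function and key names are hypothetical.

```python
import random

AGES_R = [-9, -3, 3, 9]                      # adult ages centered at 55
AGES_C = [-4.5 + k for k in range(10)]       # adolescent ages centered at 14.5

def simulate_person(rng):
    """Draw W_j, Y_2j, beta_1j and the SBP and BMI series for one person."""
    W = 1 if rng.random() < 0.5 else 0
    Y2 = rng.gauss(1 + W, 1.0)               # T_22 = 1
    mean = 1 + W + Y2                        # common mean of the four coefficients
    shared = rng.gauss(0.0, 1.0)
    # sqrt(0.2)*shared + sqrt(0.8)*own gives variance 1 and covariance 0.2.
    dev = [0.2 ** 0.5 * shared + 0.8 ** 0.5 * rng.gauss(0.0, 1.0) for _ in range(4)]
    bR0, bR1, bC0, bC1 = [mean + d for d in dev]
    # Assumption: the fixed age term is the squared centered age with coefficient 1.
    R = [rng.gauss(bR0 + bR1 * a + a ** 2, 1.0) for a in AGES_R]
    C = [rng.gauss(bC0 + bC1 * a + a ** 2, 1.0) for a in AGES_C]
    return {"W": W, "Y2": Y2, "dev": dev, "R": R, "C": C}

rng = random.Random(2024)
data = [simulate_person(rng) for _ in range(500)]   # J = 500 persons

# Sample covariance of the first two coefficient deviations (target 0.2).
d0 = [p["dev"][0] for p in data]
d1 = [p["dev"][1] for p in data]
m0, m1 = sum(d0) / len(d0), sum(d1) / len(d1)
cov01 = sum((x - m0) * (y - m1) for x, y in zip(d0, d1)) / (len(d0) - 1)
```

Repeating the last two steps with different seeds would give the replicated data sets; missing values are imposed separately.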


¹Convergence to ML is taken to be less than 10~° in the difference between $l$ values of two consecutive
iterations.
Next, we draw missing values of $R_{ij}$, $C_{tj}$ and $Y_{2j}$. For each simulated data set, we simulate
missing indicators $M_{Rij}$, $M_{Ctj}$ and $M_{Y2j}$ of $R_{ij}$, $C_{tj}$ and $Y_{2j}$, respectively, according to the
following mechanisms

$$\mathrm{logit}(p_{ij}) = \delta_0 + \delta_1W_j + u_j; \quad \mathrm{logit}(p_{tj}) = \delta_0 + \delta_1W_j + v_j; \quad \mathrm{logit}(p_j) = \delta_0 + \delta_1W_j$$

for independent $u_j \sim N(0, 1)$ and $v_j \sim N(0, 1)$ such that $M_{Rij} \sim \mathrm{Bernoulli}(p_{ij})$, $M_{Ctj} \sim
\mathrm{Bernoulli}(p_{tj})$ and $M_{Y2j} \sim \mathrm{Bernoulli}(p_j)$. We simulated $\delta_0 = -1.5$ and $\delta_1 = -1.5$ to obtain
$p_{ij} = 0.15$ (0.07 for $W_j = 1$ and 0.22 for $W_j = 0$), $\delta_0 = -2$ and $\delta_1 = -1$ to yield $p_{tj} = 0.11$
(0.07 for $W_j = 1$ and 0.16 for $W_j = 0$), and $\delta_0 = -1$ and $\delta_1 = 0.8$ to simulate $p_j = 0.36$ (0.45
for $W_j = 1$ and 0.27 for $W_j = 0$) on average. Missing values are MAR because $M_{Rij}$, $M_{Ctj}$
and $M_{Y2j}$ depend on known $W_j$. The missing rates closely reflect those of SBP, BMI and
menarche in the FLS sample. As a result, 4.1, 17.0, 48.5, 126.1, and 304.4 individuals have
0, 1, 2, 3 and 4 SBPs observed, respectively, on average in the 1,000 observed data sets.
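The average missing rates implied by these mechanisms can be approximated by Monte Carlo. The sketch below uses only the Python standard library; the function names, the seed and the number of simulated persons are arbitrary choices for illustration, with four SBP and ten BMI occasions per person as in the simulation design.

```python
import math
import random

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def average_missing_rates(J, n_R=4, n_C=10, seed=1):
    """Monte Carlo estimate of the average missing rates under the three mechanisms."""
    rng = random.Random(seed)
    missing = {"SBP": 0, "BMI": 0, "menarche": 0}
    for _ in range(J):
        W = 1 if rng.random() < 0.5 else 0
        u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)   # person-level effects
        p_R = expit(-1.5 - 1.5 * W + u)    # SBP: delta_0 = -1.5, delta_1 = -1.5
        p_C = expit(-2.0 - 1.0 * W + v)    # BMI: delta_0 = -2, delta_1 = -1
        p_Y = expit(-1.0 + 0.8 * W)        # menarche: delta_0 = -1, delta_1 = 0.8
        missing["SBP"] += sum(rng.random() < p_R for _ in range(n_R))
        missing["BMI"] += sum(rng.random() < p_C for _ in range(n_C))
        missing["menarche"] += rng.random() < p_Y
    return {
        "SBP": missing["SBP"] / (J * n_R),
        "BMI": missing["BMI"] / (J * n_C),
        "menarche": missing["menarche"] / J,
    }

rates = average_missing_rates(J=50000)
```

Because $u_j$ and $v_j$ are shared across a person's occasions, missingness clusters within persons, which is why some simulated individuals end up with no observed SBP at all.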

From the simulated bivariate distribution of BMI and SBP given $Y_{2j}$ and $W_j$, we compute
and show the implied simulated GCM (4) for SBP conditional on the random coefficients
$\beta_{C0j}$ and $\beta_{C1j}$ of BMI, $W_j$ and $Y_{2j}$ under column heading “simulated” in Table 1. Next,
we illustrate our approach for estimation of the joint model $f(R_{ij}, C_{tj}, Y_{2j}|a_{ij}, a_{tj}, W_j)$ in
Equations (1), (2) and (5), i.e., Equation (7), by the EM algorithm and the subsequent delta
method transforming the estimates to yield the estimated simulated GCM (4). We repeated
our approach to estimate the simulated GCM (4) given each of the 1,000 simulated data
sets to produce average estimates with associated average standard errors, biases and mean
squared errors under heading “complete data EM” in Table 1. The average estimates are
very close to the simulated parameters, and the standard errors (in parentheses), biases and
mean squared errors are comparatively small.

Given partially observed data, we again obtain the average estimates of the simulated
GCM (4) by our approach and list them under heading “observed data EM” in Table 1.
The average estimates are very close to the simulated parameters. Consequently, the biases
are very small and near their counterparts under complete data EM. The standard errors and
mean squared errors are 5% to 28% and 0% to 57% larger, respectively, than those of complete
data EM, reflecting added uncertainty due to missing data. Estimation of the joint model by
the EM algorithm took only a few seconds to converge to ML in 22 iterations on average.


5 Data Analysis 


We now analyze the 835 female participants of FLS. Controlling for age, birth year cohort, 
and historical time as a visit year to a clinic for measurement, we estimate a multilevel GCM 
in Equation (4) for the effects of growth rates in adolescent BMI and menarche on middle-age 
SBP. The covariates may also strengthen or weaken the effect of aging on SBP. We discuss 
the statistical significance of an estimate at a significance level 0.05. 

Table 2 summarizes sample data. At level 1, adolescent BMI and middle-age SBP range 
12.32 to 46.61 and 75 to 210, missing 12% and 18% of the values, respectively. Age is centered 
at midyear 14.5 of adolescence (10 to 19 years of age) and 55 of middle age (45 to 65 years 
of age). Onset is 1 from diagnosis of a cancer and 0 otherwise. Visit years start from 1930 in 
adolescence and 1942 in middle age. We denote the last two digits of visit years ranging aa 
to bb by a dummy indicator “vyraa_bb.” In middle age, vyr42_50 has a single SBP observed, 
and vyr91_00 and vyr01_10 show no difference in mean SBP by the t test so that we analyze 
vyr42_60, vyr61_70, vyr71_80, vyr81_90 and vyr91_10. In adolescence, we analyze vyr30_40 
to vyr01_10 shown in Table 2. At level 2, menarche ranges 9 to 17 years of age, missing 37%. 
Birth year cohorts range 1889 to 1963 in middle age, and 1912 to 1997 in adolescence. We 
denote the last two digits of birth years ranging aa to bb by a dummy indicator “byraa_bb.” 
A single SBP is observed in byr89_00, and no evidence of differences was found in mean SBP
between byr01_10 and byr11_20 and in mean BMI between byr81_90 and byr91_97 by the t
tests. Thus, we analyze byr_20, referring to birth years from 1889 to 1920 in middle age and
from 1912 to 1920 in adolescence, to byr81_97 shown in Table 2. We standardize BMI and
SBP to have mean 0 and variance 1, and center menarche at midpoint 13 years of age.

For efficient estimation of the GCM by all observed data, we estimate the joint distribution
of $R_{ij}$ = SBP, $C_{tj}$ = BMI and $Y_{2j}$ = menarche given known covariates in Equations (1),
(2) and (5) for $A_{Rij}^T = [1\ a_{ij}]$, $B_{Rij}^T$ = [onset vyr42_60 vyr61_70 vyr71_80 vyr81_90] with the
latest vyr91_10 as the reference level, $A_{Ctj}^T = [1\ a_{tj}\ a_{tj}^2\ a_{tj}^3]$, $B_{Ctj}$ having adolescent onset
and seven visit year indicators with the latest vyr01_10 as the reference, and $W_{2j}$ having
1 for the intercept followed by the seven cohort indicators with the latest byr81_97 as the
reference. In middle age, birth years range up to 1963 with 9 individuals in byr61_70 and no
difference in mean SBP found between byr51_60 and byr61_70 so that we analyze byr51_70
as the reference and set the effects of later cohorts on $\beta_{Rj}$ to zero. Preliminary analysis by
efficient ML estimation of two hierarchical models $f(R_{ij}, Y_{2j})$ and $f(C_{tj}, Y_{2j})$ given known
covariates implied by the joint model (Shin and Raudenbush 2007) not only provided the
initial values of $\theta$, but also enabled a series of likelihood ratio tests (LRT) that supported
the first-degree polynomial $\beta_{R0j} + \beta_{R1j}a_{ij}$ in middle age and the third-degree polynomial
$\beta_{C0j} + \beta_{C1j}a_{tj} + \beta_{C2j}a_{tj}^2 + \beta_{C3j}a_{tj}^3$ in adolescence (Stram and Lee 1994).

We attempted to simplify the joint model having 93 parameters, consisting of 63 fixed
effects, 28 parameters in $\phi_T$, and 2 error variances $\sigma_{CC}$ and $\sigma_{RR}$. We identified and dropped
eight BMIs and one SBP influential on the estimates by plotting the empirical Bayes estimates
of random effects, and found some cohort effects different from zero on all random
coefficients but $\beta_{R1j}$, for which the LRT produced the p-value 0.31. With no cohort effects on $\beta_{R1j}$,
we tested if the aging effect on SBP is randomly varying by testing $\mathrm{var}(\beta_{R1j}) = 0$. We also
tested if the cubic age effect on BMI is random by testing $\mathrm{var}(\beta_{C3j}) = 0$. The large-sample
distribution of each LRT statistic is a 50:50 mixture of two chi-square distributions with
consecutive degrees of freedom (Stram and Lee 1994). The
LRT statistics yield respective p-values 0.0002 and 0 in support of the random coefficients.
Likewise, we found other age coefficients on BMI randomly varying.
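A p-value under a 50:50 chi-square mixture can be computed with the standard library alone. The sketch below is illustrative: the helper names are hypothetical, the degrees of freedom passed in the usage lines are arbitrary examples rather than those of the tests reported here, and the upper-tail probability for integer degrees of freedom is built from the recurrence $Q(a + 1, x) = Q(a, x) + x^a e^{-x}/\Gamma(a + 1)$ for the regularized upper incomplete gamma function.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-square_df > x) for a non-negative integer df."""
    if x <= 0:
        return 1.0
    if df == 0:
        return 0.0                      # chi-square with 0 df is a point mass at 0
    half = x / 2.0
    if df % 2 == 1:
        a, sf = 0.5, math.erfc(math.sqrt(half))   # df = 1
    else:
        a, sf = 1.0, math.exp(-half)               # df = 2
    while 2 * a < df:
        # Q(a + 1, half) = Q(a, half) + half^a * exp(-half) / Gamma(a + 1)
        sf += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return sf

def mixture_pvalue(lrt, df_low):
    """P-value under the mixture 0.5 * chi2(df_low) + 0.5 * chi2(df_low + 1)."""
    return 0.5 * chi2_sf(lrt, df_low) + 0.5 * chi2_sf(lrt, df_low + 1)

# Usage with hypothetical statistics and degrees of freedom.
p_single_variance = mixture_pvalue(2.706, 0)   # one variance, no covariances
p_with_covariances = mixture_pvalue(14.0, 6)   # a variance plus six covariances
```

Relative to a naive chi-square reference with the larger degrees of freedom, the mixture gives a smaller p-value, which reflects that the null value of a variance lies on the boundary of the parameter space.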


Next, we transformed the full model to the desired GCM $f(R_{ij}|\beta_{Cj}, Y_{2j})$ given known
covariates, where $Y_{2j}$ and $\beta_{Cj}$ are centered at their respective birth-year means
$X_{2j}\gamma_2$ in Equation (5). Consequently, the cohort effects on $\beta_{Rj}$ are marginal in
the sense that they control for neither menarche nor BMI growth rates. The joint model
also yields $f(Y_{2j})$ and $f(C_{tj}|Y_{2j})$ given known covariates.

We show the estimated parameters of $f(Y_{2j})$, $f(C_{ij} \mid Y^*_{2j})$ and $f(R_{ij} \mid \beta^*_{Cj}, Y^*_{2j})$ in Table
3. The estimates of $f(Y_{2j})$ under the column heading “$f(Y_{2j})$” reveal that the average age at
menarche is 12.47 years in byr81_97 (the intercept -0.53 plus the centering age of 13 for
menarche), younger than in earlier cohorts, although the cohort gaps are significant
only for byr_20, byr51_60 and byr61_70. The estimates of $f(C_{ij} \mid Y^*_{2j})$ in the subsequent column
under the heading “$f(C_{ij} \mid Y^*_{2j})$” reveal that for a 14.5 year-old adolescent, the expected BMI in
byr81_97 is 0.67 standard deviations (SD) above the mean BMI and higher than in
earlier cohorts, with significant gaps for byr31_40 and later cohorts, and that BMI decreases by
0.39 SD on average for each year of delay in menarche, controlling for onset and visit years.
The linear age effect in byr81_97 is positive and higher than those in byr31_40 and byr51_60,
and the quadratic and cubic age effects are negative and near zero in byr81_97, respectively,
with no significant gaps across cohorts on average, controlling for the covariates in the model.
The person-specific linear and quadratic growth rates $\beta_{C1j}$ and $\beta_{C2j}$ in BMI are positively,
and the cubic rate $\beta_{C3j}$ is negatively, associated with menarche, ceteris paribus.
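
The arithmetic behind the cohort-specific mean ages at menarche can be sketched as below. The intercept and cohort gaps are the rounded values reported in the $f(Y_{2j})$ column of Table 3, and the centering age of 13 is taken from the text. The dictionary and function names are hypothetical, and the results inherit the rounding of the published estimates.

```python
CENTERING_AGE = 13.0  # menarche is centered at age 13, as stated in the text
INTERCEPT = -0.53     # reference cohort byr81_97, from Table 3

# Cohort gaps relative to byr81_97, from the f(Y_2j) column of Table 3.
COHORT_GAPS = {
    "byr_20": 1.74,
    "byr21_30": 0.40,
    "byr31_40": 0.15,
    "byr41_50": 0.32,
    "byr51_60": 0.49,
    "byr61_70": 0.52,
    "byr71_80": 0.18,
    "byr81_97": 0.00,
}


def mean_menarche_age(cohort: str) -> float:
    """Estimated mean age at menarche for a birth-year cohort."""
    return CENTERING_AGE + INTERCEPT + COHORT_GAPS[cohort]


if __name__ == "__main__":
    for cohort in COHORT_GAPS:
        print(f"{cohort}: {mean_menarche_age(cohort):.2f}")
```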

The estimated parameters of the desired GCM $f(R_{ij} \mid \beta^*_{Cj}, Y^*_{2j})$ in Equation (4) appear in
the last column under the heading “$f(R_{ij} \mid \beta^*_{Cj}, Y^*_{2j})$”. For a 55 year-old woman, the growth rates
in adolescent BMI and menarche are positively associated with SBP, controlling for cohorts,
onset and historical time, but these associations are statistically insignificant. The aging
effect on SBP is positive and significant, but it is not significantly moderated by the growth rates
in adolescent BMI or by menarche, ceteris paribus. Women in earlier cohorts (earlier visit
years) have higher (lower) expected SBP than do those in byr41_70 (vyr91_10), controlling
for age, onset and historical time (cohorts). However, the effect sizes are at most modest.
Overall, the aging effect on SBP is significant, controlling for cohorts and historical time, but
the cohort and temporal effects are not, controlling for age. To check whether we might have found
the aging effect by chance, we applied the Bonferroni adjustment for testing 21 fixed effects and
confirmed that the aging effect remains significant.
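
A Bonferroni check of this kind can be sketched as follows. The sketch assumes a two-sided Wald z-test built from the rounded age coefficient and standard error in the last column of Table 3, 0.05 (0.01), and a family-wise level of 0.05. The authors' exact test and level are not stated in this section, so the function names and the level are assumptions.

```python
import math


def wald_pvalue(estimate: float, se: float) -> float:
    """Two-sided p-value of a Wald z-test of H0: coefficient = 0."""
    z = abs(estimate / se)
    return math.erfc(z / math.sqrt(2.0))


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise level at alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    p_age = wald_pvalue(0.05, 0.01)              # rounded values from Table 3
    threshold = bonferroni_threshold(0.05, 21)   # 21 fixed effects
    print(p_age, threshold, p_age < threshold)
```

With these rounded inputs the z-statistic is 5, so the p-value falls far below the adjusted threshold of 0.05/21.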

Figure 2 graphs the predicted growth in adolescent BMI and middle-age SBP, controlling
for onset and historical time of measurement, respectively. Most strikingly, BMI in byr81_97 is
not only higher at the beginning of adolescence but also grows faster throughout adolescence than
in previous cohorts, such that the differences between byr81_97 and the other cohorts grow to
between 0.66 and 1.08 SDs at age 19. In contrast, the predicted progression of
middle-age SBP barely changes across the cohorts up to byr61_70. This may be because the
middle-age measurements in byr81_97 are not yet available. When they become available, the
impact of the rapid growth in adolescent BMI on middle-age health may be more pronounced.

Figure 3 exhibits the normal probability plots of the level-2 and level-1 residuals from the
joint model that produced the estimates in Table 3. Modestly heavy tails are noticeable in
some level-2 plots, in particular those of the random intercepts. This may be because
as many as 296 adolescents and 139 adults have only zero to two repeated measurements
observed in BMI and SBP, respectively, which reduces the effective sample sizes for the
model fit. It may also be due to omitted covariates that we were unable to control for, such as
socioeconomic and environmental covariates unavailable in FLS. A non-normal model that
accommodates heavy tails might have improved the fit. Other than the moderately
heavy tails, the residuals do not appear to deviate severely from the assumed model.

We repeated the analysis with an extra covariate, indicating whether an individual was taking
medication at the time of measurement, added to the SBP and BMI equations of the joint model.
Medication was recorded at 22 adolescent and 53 middle-age occasions in total.
The covariate had no significant effect on either BMI or SBP, nor did it change the
estimates in Table 3 in a notable way.
6 Discussion


We analyzed the repeated measurements of adolescent BMI and middle-age SBP nested 
within 835 female participants of FLS to estimate the lasting effects of growth rates in BMI 
and menarche, including the growth rates-by-age and menarche-by-age interaction effects, 
on SBP later in middle age via a multilevel GCM, controlling for onset of cancer, age, birth 
year cohort, and historical time of measurement. Estimation of the GCM was nonstandard 
and computationally intensive because BMI and SBP were repeatedly measured on two 
distinct sets of unbalanced time points nested within each participant, and missing values 
were substantial on BMI, SBP and menarche. 

For efficient estimation using all observed data, we estimated the multilevel joint distribution
of SBP, BMI and menarche, assumed MAR given known covariates, by ML via the
EM algorithm. A series of likelihood ratio tests revealed that the coefficients of the linear,
quadratic and cubic age terms on BMI and of linear age on SBP vary randomly across individuals,
and that cohort effects are significant on menarche and on the random coefficients except
the random slope of age on SBP. We transformed the estimated joint model to the desired
GCM by the delta method. Because we analyzed FLS female participants who are mostly
white, our findings are most appropriately applicable to European Americans. We illustrated
unbiased estimation by our method via simulation.
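
The bias and mean squared error summaries of a simulation study such as the one in Table 1 can be computed as sketched below. This is only an illustration of the summary step: the estimator is a plain sample mean standing in for the EM estimates, and the true value, sample size, seed and function names are hypothetical choices rather than the simulation design of this paper.

```python
import random
import statistics


def bias_and_mse(estimates, truth):
    """Bias and mean squared error of replicate estimates around a known true value."""
    bias = statistics.fmean(estimates) - truth
    mse = statistics.fmean((e - truth) ** 2 for e in estimates)
    return bias, mse


def simulate_estimates(truth, n_obs, n_reps, seed=0):
    """Stand-in simulation: each replicate estimates `truth` by a sample mean."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(truth, 1.0) for _ in range(n_obs))
        for _ in range(n_reps)
    ]


if __name__ == "__main__":
    estimates = simulate_estimates(truth=0.67, n_obs=50, n_reps=1000)
    print(bias_and_mse(estimates, 0.67))
```

A bias close to zero relative to the standard error, together with an MSE close to the squared standard error, is the pattern one would read as evidence of unbiased estimation.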

The estimated mean growth trajectory in adolescent BMI is much higher among individuals
born in 1981 to 1997 than among those in earlier cohorts, such that the mean differences are
most striking by age 19, ranging from 0.66 to 1.08 SDs on average, as shown in Figure 2. Because
middle-age measurements of those born in 1981 or later are not yet available, we are unable
to show the impact on middle-age SBP. When they become available, we anticipate that the
impact will strengthen quite remarkably.

We found only modest effect sizes of growth rates in adolescent BMI and menarche on
SBP in our analysis. That may be because we have ignored an important family level.
Because 376 family members and relatives who accompanied 459 participants enrolled at
birth to clinics for measurement were also measured, analysis of the repeated measurements
of individuals nested within families would have resulted in more efficient estimates. Such an
analysis will be important future work.

Because FLS started collecting socioeconomic covariates such as income and education
only in 2002, we could not control for the effects of such covariates. The
omitted covariates may have contributed to the modest effect sizes we found. A valuable
direction for future research is to reanalyze the GCM, combining FLS with other longitudinal samples
that have collected extensive socioeconomic covariates. Our missing data method can be
extended to handle such analysis of multiple samples.

The FLS female sample comprises 1,196 middle-age occasions of 436 subjects, or 2.7 occasions
per middle-age female. In addition, missing data are substantial on BMI, SBP and menarche.
Consequently, the FLS female sample sizes may not have provided adequate power to detect
the desired effects. Furthermore, adolescence and middle age may be too distant to reveal
significant associations between BMI and SBP, possibly due to confounders. In the near future,
we want to reanalyze the model in a way that increases sample sizes and also in a way that
involves closer timeframes. For example, we may estimate the effects of growth rates in
teenagers’ obesity on an earlier adulthood outcome using the entire sample.

Adolescent obesity may also be too distant in time to reveal direct effects on middle-age
health. What we have found in this paper may be evidence of a lack of direct
effects of growth rates in adolescent obesity on middle-age health, and important mediators
may exist at intermediate ages. For example, it is plausible that the effects of adolescent obesity
on middle-age health are mediated by a health outcome in earlier adulthood.
Although we have illustrated the impact of a univariate BMI in adolescence on a univariate
SBP in middle age, extension to multivariate health outcomes is straightforward.

Under our current model, we assumed constant variances $\sigma_{RR}$ and $\sigma_{CC}$ of the occasion-specific
or within-person random errors of middle-age SBP and adolescent BMI, respectively. In the near
future, we want to estimate non-constant error variances $\sigma_{RRj}$ and $\sigma_{CCj}$ of person $j$ such
that the person-specific random coefficients of SBP are correlated with the within-person
variances $\log(\sigma_{RRj})$ and $\log(\sigma_{CCj})$ (Hedeker et al. 2008). We will also consider
non-constant variances of the SBP and BMI growth trajectories, for example, across males and
females in an extended FLS analysis.
Figure 1: Scatter and spaghetti plots superimposed for the unbalanced repeated measurements
of adolescent BMI and middle-age SBP against age from 835 FLS female participants
enrolled over time across multiple birth year cohorts.


Table 1: Model (4) simulated 1,000 times for $n_j = 4$, $m_j = 10$ and $J = 500$ with simulated
parameters under the heading “simulated”. “Complete data EM” and “observed data EM” list the
average estimates and standard errors (se) by the EM algorithm, bias and mean squared
errors (mse) given completely observed data and data MAR, respectively.


                        simulated   complete data EM                  observed data EM
                                    estimate (se)   bias, mse         estimate (se)   bias, mse
1                       0.67        0.66 (0.11)     -0.005, 0.012     0.66 (0.13)     -0.004, 0.015
$W_j$                   0.67        0.67 (0.13)     0.007, 0.016      0.67 (0.14)     0.008, 0.020
$\beta_{C0j}$           0.17        0.17 (0.05)     0.001, 0.003      0.17 (0.06)     0.000, 0.004
$\beta_{C1j}$           0.17        0.17 (0.05)     0.001, 0.003      0.17 (0.06)     0.000, 0.003
$Y_{2j}$                0.67        0.66 (0.08)     -0.003, 0.007     0.67 (0.11)     -0.001, 0.011
$a_{ij}$                0.67        0.66 (0.10)     -0.004, 0.010     0.66 (0.10)     -0.005, 0.011
$a_{ij} W_j$            0.67        0.67 (0.12)     0.004, 0.013      0.67 (0.12)     0.004, 0.014
$a_{ij} \beta_{C0j}$    0.17        0.17 (0.05)     -0.001, 0.002     0.17 (0.05)     -0.001, 0.003
$a_{ij} \beta_{C1j}$    0.17        0.17 (0.05)     0.002, 0.002      0.17 (0.05)     0.001, 0.002
$a_{ij} Y_{2j}$         0.67        0.66 (0.08)     -0.002, 0.005     0.67 (0.09)     0.000, 0.007
$a_{ij}^2$              1.00        1.00 (0.00)     0.000, 0.000      1.00 (0.00)     0.000, 0.000
$\tau_{00}$             0.93        0.92 (0.08)     -0.011, 0.006     0.92 (0.09)     -0.011, 0.008
$\tau_{01}$             0.13        0.14 (0.05)     0.002, 0.002      0.13 (0.06)     0.002, 0.003
$\tau_{11}$             0.93        0.92 (0.06)     -0.009, 0.004     0.92 (0.07)     -0.009, 0.004
$\sigma_{RR}$           1.00        1.00 (0.05)     -0.002, 0.002     1.00 (0.05)     -0.002, 0.003

Table 2: FLS female data summary for analysis. “vyr” and “byr” stand for visit year and
birth year, respectively.


Level               Mean (sd, missing)                   Mean (sd, missing)
1       SBP         119.50 (18.44, 18%)     BMI          19.74 (3.53, 12%)
        age         -1.21 (5.71, 0%)        age          -0.56 (2.47, 0%)
        onset       0.10 (0.30, 0%)         onset        1E-3 (0.03, 0%)
        vyr42_60    0.07 (0.26, 0%)         vyr30_40     0.03 (0.18, 0%)
        vyr61_70    0.17 (0.38, 0%)         vyr41_50     0.19 (0.40, 0%)
        vyr71_80    0.10 (0.30, 0%)         vyr51_60     0.16 (0.36, 0%)
        vyr81_90    0.07 (0.26, 0%)         vyr61_70     0.16 (0.36, 0%)
        vyr91_10    0.58 (0.49, 0%)         vyr71_80     0.18 (0.38, 0%)
                                            vyr81_90     0.11 (0.31, 0%)
                                            vyr91_00     0.11 (0.31, 0%)
                                            vyr01_10     0.07 (0.25, 0%)
2       menarche    12.73 (1.20, 37%)
        byr_20      0.20 (0.40, 0%)
        byr21_30    0.06 (0.24, 0%)
        byr31_40    0.12 (0.33, 0%)
        byr41_50    0.11 (0.32, 0%)
        byr51_60    0.17 (0.38, 0%)
        byr61_70    0.12 (0.32, 0%)
        byr71_80    0.07 (0.26, 0%)
        byr81_97    0.14 (0.35, 0%)

Figure 2: Predicted growth in adolescent BMI and middle-age SBP controlling for onset and
historical time of a visit.

Table 3: Estimates (standard errors) of $f(Y_{2j})$, $f(C_{ij} \mid Y^*_{2j})$ and $f(R_{ij} \mid \beta^*_{Cj}, Y^*_{2j})$.


Covariates                  $f(Y_{2j})$       $f(C_{ij} \mid Y^*_{2j})$    Covariates                  $f(R_{ij} \mid \beta^*_{Cj}, Y^*_{2j})$
1                           -0.53 (0.11)**    0.67 (0.08)**                1                           0.03 (0.11)
byr_20                      1.74 (0.56)**     -3.12 (9.80)                 byr_20                      0.47 (0.36)
byr21_30                    0.40 (0.27)       -0.45 (0.38)                 byr21_30                    0.43 (0.29)
byr31_40                    0.15 (0.16)       -0.66 (0.17)**               byr31_40                    0.16 (0.19)
byr41_50                    0.32 (0.16)       -0.44 (0.21)*                byr41_50                    -0.03 (0.14)
byr51_60                    0.49 (0.14)**     -0.48 (0.16)**               $\beta^*_{C0j}$             0.10 (0.21)
byr61_70                    0.52 (0.15)**     -0.39 (0.15)**               $\beta^*_{C1j}$             0.78 (2.29)
byr71_80                    0.18 (0.16)       -0.43 (0.19)*                $\beta^*_{C2j}$             0.54 (6.95)
$Y^*_{2j}$                                    -0.39 (0.05)**               $\beta^*_{C3j}$             8.99 (68.2)
$a_{ij}$                                      0.20 (0.02)**                $Y^*_{2j}$                  0.01 (0.10)
$a_{ij}$ byr_20                               4.85 (11.7)                  $a_{ij}$                    0.05 (0.01)**
$a_{ij}$ byr21_30                             -0.04 (0.08)                 $a_{ij} \beta^*_{C0j}$      -0.01 (0.02)
$a_{ij}$ byr31_40                             -0.06 (0.03)*                $a_{ij} \beta^*_{C1j}$      0.18 (0.16)
$a_{ij}$ byr41_50                             -0.01 (0.03)                 $a_{ij} \beta^*_{C2j}$      0.21 (0.96)
$a_{ij}$ byr51_60                             -0.06 (0.03)*                $a_{ij} \beta^*_{C3j}$      2.58 (5.02)
$a_{ij}$ byr61_70                             0.00 (0.03)                  $a_{ij} Y^*_{2j}$           -0.01 (0.01)
$a_{ij}$ byr71_80                             -0.04 (0.03)                 onset                       -0.00 (0.15)
$a_{ij} Y^*_{2j}$                             0.03 (0.01)**                vyr42_60                    -0.33 (0.78)
$a_{ij}^2$                                    -0.02 (0.00)**               vyr61_70                    -0.58 (0.32)
$a_{ij}^2$ byr_20                             -2.36 (4.18)                 vyr71_80                    -0.05 (0.25)
$a_{ij}^2$ byr21_30                           -0.005 (0.02)                vyr81_90                    -0.26 (0.16)
$a_{ij}^2$ byr31_40                           -0.004 (0.01)
$a_{ij}^2$ byr41_50                           0.001 (0.01)
$a_{ij}^2$ byr51_60                           0.002 (0.01)
$a_{ij}^2$ byr61_70                           -0.001 (0.01)
$a_{ij}^2$ byr71_80                           0.004 (0.01)
$a_{ij}^2 Y^*_{2j}$                           0.004 (0.002)*
$a_{ij}^3$                                    0.000 (0.001)
$a_{ij}^3$ byr_20                             0.330 (0.470)
$a_{ij}^3$ byr21_30                           -0.003 (0.005)
$a_{ij}^3$ byr31_40                           -0.000 (0.002)
$a_{ij}^3$ byr41_50                           -0.002 (0.002)
$a_{ij}^3$ byr51_60                           0.000 (0.002)
$a_{ij}^3$ byr61_70                           -0.003 (0.002)
$a_{ij}^3$ byr71_80                           -0.000 (0.002)
$a_{ij}^3 Y^*_{2j}$                           -0.002 (0.001)**
onset                                         -0.10 (0.15)
vyr30_40                                      0.02 (0.17)
vyr41_50                                      0.04 (0.11)
vyr51_60                                      0.05 (0.09)
vyr61_70                                      -0.01 (0.07)
vyr71_80                                      0.00 (0.07)
vyr81_90                                      0.03 (0.06)
vyr91_00                                      -0.04 (0.04)
$\tau_{22}$ or var($\beta_j$)   0.95          0.67   0.02   -5E-3   4E-4                               0.51   0.02
                                                     0.02   -7E-4   -7E-4                                     0.001
                                                            6E-4    7E-5
                                                                    4E-5
$\sigma_{CC}$                                 0.036                        $\sigma_{RR}$               0.401


* p-value < .05; ** p-value < .01; $a$E-$y$ = $a \times 10^{-y}$

Figure 3: Residual plots of the joint model that produced estimates in Table 3.